Aggregation (linguistics)

In linguistics, aggregation is a subtask of natural language generation, which involves merging syntactic constituents (such as sentences and phrases) together. Sometimes aggregation can be done at a conceptual level.


Examples

A simple example of syntactic aggregation is merging the two sentences ''John went to the shop'' and ''John bought an apple'' into the single sentence ''John went to the shop and bought an apple''. Syntactic aggregation can be much more complex than this. For example, aggregation can embed one of the constituents in the other; e.g., we can aggregate ''John went to the shop'' and ''The shop was closed'' into the sentence ''John went to the shop, which was closed''.

From a pragmatic perspective, aggregating sentences together often suggests to the reader that these sentences are related to each other. If this is not the case, the reader may be confused. For example, someone who reads ''John went to the shop and bought an apple'' may infer that the apple was bought in the shop; if this is not the case, then these sentences should not be aggregated.
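The two syntactic examples in this section can be sketched in a few lines of Java. This is a minimal, hypothetical illustration and is not taken from any NLG library: the `Clause` class and the methods `coordinate` and `embedRelative` are invented names, and the string matching only works for clauses that have already been split into a subject and a predicate.

```java
import java.util.Optional;

public class SyntacticAggregationSketch {

    /** Hypothetical toy clause: a subject plus the rest of the sentence. */
    static final class Clause {
        final String subject;
        final String predicate;

        Clause(String subject, String predicate) {
            this.subject = subject;
            this.predicate = predicate;
        }

        String text() {
            return subject + " " + predicate;
        }
    }

    /** Coordination with a shared subject: "S p1" + "S p2" becomes "S p1 and p2". */
    static Optional<String> coordinate(Clause a, Clause b) {
        if (!a.subject.equalsIgnoreCase(b.subject)) {
            return Optional.empty(); // no shared subject, so nothing can be elided
        }
        return Optional.of(a.text() + " and " + b.predicate);
    }

    /** Embedding: if clause a ends with clause b's subject, attach b as a relative clause. */
    static Optional<String> embedRelative(Clause a, Clause b) {
        if (!a.text().toLowerCase().endsWith(b.subject.toLowerCase())) {
            return Optional.empty();
        }
        return Optional.of(a.text() + ", which " + b.predicate);
    }

    public static void main(String[] args) {
        Clause went = new Clause("John", "went to the shop");
        Clause bought = new Clause("John", "bought an apple");
        Clause closed = new Clause("The shop", "was closed");

        // John went to the shop and bought an apple
        System.out.println(coordinate(went, bought).orElse("(not aggregated)"));
        // John went to the shop, which was closed
        System.out.println(embedRelative(went, closed).orElse("(not aggregated)"));
    }
}
```

Neither method checks whether the two clauses are actually related, so the pragmatic caveat still applies: a caller would have to decide separately whether merging the sentences is appropriate.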


Algorithms and issues

Aggregation algorithms must do two things:

* Decide when two constituents should be aggregated
* Decide how two constituents should be aggregated, and create the aggregated structure

The first issue, deciding when to aggregate, is poorly understood. Aggregation decisions certainly depend on the semantic relations between the constituents, as mentioned above; they also depend on the genre (e.g., bureaucratic texts tend to be more aggregated than instruction manuals). They probably should also depend on rhetorical and discourse structure. The literacy level of the reader is probably important as well (poor readers need shorter sentences). However, there is no integrated model that brings all these factors together into a single algorithm.

With regard to the second issue, there have been some studies of different types of aggregation and how they should be carried out. Harbusch and Kempen describe several syntactic aggregation strategies. In their terminology, ''John went to the shop and bought an apple'' is an example of forward conjunction reduction.

Much less is known about conceptual aggregation. Di Eugenio ''et al.'' show how conceptual aggregation can be done in an intelligent tutoring system, and demonstrate that performing such aggregation makes the system more effective (and that conceptual aggregation makes a bigger impact than syntactic aggregation).
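The two decisions named at the start of this section (when to aggregate, and how) can be kept apart in code. The Java sketch below is only illustrative. Because deciding when to aggregate is poorly understood, `shouldAggregate` is an assumed placeholder heuristic: it requires a shared subject, an explicit `relatedToPrevious` flag standing in for a semantic relation, and a `maxWords` limit standing in for the reader's literacy level. All class, method, and parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class AggregationPipelineSketch {

    /** Hypothetical toy clause; relatedToPrevious stands in for a semantic relation. */
    static final class Clause {
        final String subject;
        final String predicate;
        final boolean relatedToPrevious;

        Clause(String subject, String predicate, boolean relatedToPrevious) {
            this.subject = subject;
            this.predicate = predicate;
            this.relatedToPrevious = relatedToPrevious;
        }
    }

    static int wordCount(String s) {
        return s.trim().split("\\s+").length;
    }

    /** Decision 1 (when): assumed placeholder heuristic, not an established model. */
    static boolean shouldAggregate(String subject, String text, Clause next, int maxWords) {
        int mergedLength = wordCount(text) + 1 + wordCount(next.predicate); // +1 for "and"
        return next.relatedToPrevious
                && next.subject.equals(subject)
                && mergedLength <= maxWords;
    }

    /** Decision 2 (how): coordinate the predicates and drop the repeated subject. */
    static String aggregate(String text, Clause next) {
        return text + " and " + next.predicate;
    }

    /** Walks the clauses in order, merging each one into the current sentence when allowed. */
    static List<String> realise(List<Clause> clauses, int maxWords) {
        List<String> sentences = new ArrayList<>();
        String subject = null;
        String text = null;
        for (Clause next : clauses) {
            if (text != null && shouldAggregate(subject, text, next, maxWords)) {
                text = aggregate(text, next);
            } else {
                if (text != null) {
                    sentences.add(text + ".");
                }
                subject = next.subject;
                text = next.subject + " " + next.predicate;
            }
        }
        if (text != null) {
            sentences.add(text + ".");
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<Clause> clauses = new ArrayList<>();
        clauses.add(new Clause("John", "went to the shop", false));
        clauses.add(new Clause("John", "bought an apple", true));
        clauses.add(new Clause("John", "is tall", false));

        // A generous limit merges the first two clauses; a tight limit keeps all three apart.
        System.out.println(realise(clauses, 12));
        System.out.println(realise(clauses, 6));
    }
}
```

Keeping the two decisions in separate methods means the placeholder heuristic could be replaced, for example by one that also considers genre or discourse structure, without touching the code that builds the aggregated sentence.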


Software

Unfortunately, there is not much software available for performing aggregation. However, the SimpleNLG system (Gatt and Reiter, 2009) does include limited support for basic aggregation. For example, the following code causes SimpleNLG to print out ''The man is hungry and buys an apple''.

    // Standard SimpleNLG setup; imports from the simplenlg packages are omitted here.
    Lexicon lexicon = Lexicon.getDefaultLexicon();
    NLGFactory nlgFactory = new NLGFactory(lexicon);
    Realiser realiser = new Realiser(lexicon);
    SPhraseSpec s1 = nlgFactory.createClause("the man", "be", "hungry");
    SPhraseSpec s2 = nlgFactory.createClause("the man", "buy", "an apple");
    NLGElement result = new ClauseCoordinationRule().apply(s1, s2);
    System.out.println(realiser.realiseSentence(result));


References

A. Gatt and E. Reiter (2009). SimpleNLG: A realisation engine for practical applications. ''Proceedings of ENLG09''.


External links


simplenlg at GoogleCode